Comments for MEDB 5502, Week 04

Topics to be covered

  • What you will learn
    • Indicator variables for three or more categories
    • Multiple factor analysis of variance
    • Checking assumptions of analysis of variance
    • Interactions in analysis of variance
    • Interactions in analysis of covariance
    • Interactions in multiple linear regression
    • Unbalanced data

Review of one-way analysis of variance

  • \(H_0:\ \mu_1=\mu_2=\cdots=\mu_k\)
  • \(H_1:\ \mu_i \ne \mu_j\) for some i, j
    • Reject \(H_0\) if F-ratio is large
  • Note: when k=2, analysis of variance and the equal-variance t-test give the same result (\(F=t^2\))
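
The F-ratio compares variation between group means to variation within groups. The sketch below computes it by hand and checks that, for two groups, it equals the square of the pooled (equal-variance) t statistic. The data values and the function names `f_ratio` and `pooled_t` are made up for illustration and are not the full moon data.

```python
from statistics import mean

def f_ratio(groups):
    """F-ratio for a one-way analysis of variance."""
    all_values = [y for g in groups for y in g]
    grand_mean = mean(all_values)
    k = len(groups)        # number of groups
    n = len(all_values)    # total sample size
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((y - mean(g)) ** 2 for g in groups for y in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def pooled_t(g1, g2):
    """Two-sample t statistic using the pooled variance estimate."""
    n1, n2 = len(g1), len(g2)
    ss = sum((y - mean(g1)) ** 2 for y in g1) + sum((y - mean(g2)) ** 2 for y in g2)
    pooled_variance = ss / (n1 + n2 - 2)
    return (mean(g1) - mean(g2)) / (pooled_variance * (1 / n1 + 1 / n2)) ** 0.5

# Hypothetical data, two groups of four observations each
a = [5.0, 7.0, 6.0, 9.0]
b = [8.0, 10.0, 9.0, 12.0]

F = f_ratio([a, b])
t = pooled_t(a, b)
print(F, t ** 2)  # the two values agree up to rounding error
```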

Full moon data

  • Admission rates to a mental health clinic before, during, and after a full moon.
  • One year of data

Boxplot of full moon data

Descriptive statistics

Analysis of variance table

Tukey post hoc

Creating indicator variables

Running general linear model with all indicator variables

Analysis of variance table with first and second indicators

Irrelevant rows removed

Parameter estimates, 1 of 3

  • 11.458 - 13.417 = -1.959
  • 10.917 - 13.417 = -2.5

Parameter estimates, 2 of 3

  • 11.458 - 10.917 = 0.541
  • 13.417 - 10.917 = 2.5

Parameter estimates, 3 of 3

  • 10.917 - 11.458 = -0.541
  • 13.417 - 11.458 = 1.959
  • Reference category: the category whose indicator variable is left out of the model.
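
The three sets of parameter estimates follow one pattern: the intercept is the mean of the reference category, and each remaining coefficient is that category's mean minus the reference mean. A minimal sketch, using the three group means from the slides. The labels A, B, and C and the function name `parameter_estimates` are hypothetical, because the outline does not say which mean belongs to which moon phase.

```python
def parameter_estimates(means, reference):
    """Return (intercept, coefficients) when the indicator for
    `reference` is left out of the model."""
    intercept = means[reference]
    coefficients = {
        group: round(value - intercept, 3)
        for group, value in means.items()
        if group != reference
    }
    return intercept, coefficients

# Group means from the slides; the labels A, B, C are placeholders
means = {"A": 10.917, "B": 11.458, "C": 13.417}

for reference in means:
    print(reference, parameter_estimates(means, reference))
```

Changing the reference category changes every coefficient but not the fitted group means.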

Using moon as a fixed factor

Removing the unneeded rows

Parameter estimates using Moon as a fixed factor

Live demo, Indicator variables for three or more categories

Break #1

  • What you have learned
    • Indicator variables for three or more categories
  • What’s coming next
    • Multiple factor analysis of variance

Mathematical model

  • \(Y_{ijk} = \mu + \alpha_i + \beta_j +\epsilon_{ijk}\)
    • i=1,…,a levels of the first categorical variable
    • j=1,…,b levels of the second categorical variable
    • k=1,…,n replicates within each combination of the first and second categories

  • \(H_0:\ \alpha_i=0\) for all i

  • \(H_0:\ \beta_j=0\) for all j
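
Under this additive model, the gap between two levels of the first factor is the same at every level of the second factor. A small sketch with made-up values for \(\mu\), \(\alpha_i\), and \(\beta_j\) (none of these numbers come from the moon data):

```python
# Hypothetical parameter values for an additive two-factor model
mu = 12.0
alpha = {"a1": -1.0, "a2": 0.0, "a3": 1.0}   # effects of the first factor
beta = {"b1": -0.5, "b2": 0.5}               # effects of the second factor

# Expected value in each cell: mu + alpha_i + beta_j
cell = {(i, j): mu + alpha[i] + beta[j] for i in alpha for j in beta}

# Gap between levels a3 and a1, computed separately at each level of the second factor
gaps = {j: cell[("a3", j)] - cell[("a1", j)] for j in beta}
print(gaps)
```

Both gaps equal \(\alpha_3 - \alpha_1\), which is what "no interaction" means in this model.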

Crosstabulation of categorical predictors

Analysis of variance table for moon data

Removing irrelevant rows

Parameter estimates for the full moon model

Tukey post hoc test

Live demo, Multiple factor analysis of variance

Break #2

  • What you have learned
    • Multiple factor analysis of variance
  • What’s coming next
    • Checking assumptions of analysis of variance

Assumptions

  • Normality
  • Equal variances
  • Independence
  • Note: analysis of variance has no linearity assumption
    • Linearity is needed only for linear regression and analysis of covariance
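
The plots that check these assumptions are built from residuals, which in a one-factor analysis of variance are the observations minus their group means. A rough sketch with invented data that computes the residuals and compares the spread across groups:

```python
from statistics import mean, stdev

# Hypothetical data, three groups of four observations each
groups = {
    "g1": [4.0, 6.0, 5.0, 9.0],
    "g2": [7.0, 9.0, 8.0, 12.0],
    "g3": [9.0, 11.0, 16.0, 12.0],
}

# Residual = observation minus the mean of its own group
residuals = {g: [y - mean(ys) for y in ys] for g, ys in groups.items()}

# Similar standard deviations across groups are consistent with equal variances
spread = {g: stdev(ys) for g, ys in groups.items()}

for g in groups:
    print(g, residuals[g], round(spread[g], 3))
```

The pooled residuals are what would go into a Q-Q plot (normality) and a residual versus predicted value plot (equal variances).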

Q-Q plot of residuals

Residual versus predicted value plot

Live demo, Checking assumptions of analysis of variance

Break #3

  • What you have learned
    • Checking assumptions of analysis of variance
  • What’s coming next
    • Interactions in analysis of variance

What is an interaction

  • Impact of one variable is influenced by a second variable
  • Example: alcohol changes the effect of sleeping pills
  • Three types of interactions
    • Between two categorical predictors
    • Between a categorical and a continuous predictor
    • Between two continuous predictors
  • Interactions greatly complicate interpretation
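
For two categorical predictors with two levels each, an interaction can be summarized as a difference of differences in the cell means. A minimal sketch with made-up cell means (the factor levels A1, A2, B1, B2 are placeholders, not the exercise data):

```python
# Hypothetical cell means for a 2 x 2 design
cell_means = {
    ("A1", "B1"): 10.0,
    ("A2", "B1"): 14.0,
    ("A1", "B2"): 11.0,
    ("A2", "B2"): 12.0,
}

# Effect of moving from A1 to A2, computed separately at each level of B
effect_at_b1 = cell_means[("A2", "B1")] - cell_means[("A1", "B1")]
effect_at_b2 = cell_means[("A2", "B2")] - cell_means[("A1", "B2")]

# With no interaction the two effects would be equal and this would be zero
interaction = effect_at_b1 - effect_at_b2
print(effect_at_b1, effect_at_b2, interaction)
```

Because the effect of A depends on the level of B, a single "effect of A" no longer describes the data, which is why interactions complicate interpretation.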

Box plots of exercise data

Mean values for the interaction

Analysis of variance table for interaction model

Parameter estimates for the interaction model

Interaction plot, 1 of 2

Interaction plot, 2 of 2

Live demo, Interactions in analysis of variance

Break #4

  • What you have learned
    • Interactions in analysis of variance
  • What’s coming next
    • Interactions in analysis of covariance

A second type of interaction

  • Interactions in analysis of covariance
    • Between categorical predictor and continuous predictor
    • Different slopes within each category

Interaction between exercise program and hours spent exercising

Testing for interaction in analysis of covariance

Table with irrelevant rows removed

Parameter estimates

  • Intercept for prog=1, -8.997 + 2.216 = -6.781
  • Intercept for prog=2, 9.993 + 2.216 = 12.209
  • Intercept for prog=3, 2.216
  • Slope for prog=1, 10.409 + (-2.956) = 7.453
  • Slope for prog=2, 9.83 + (-2.956) = 6.874
  • Slope for prog=3, -2.956
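
The arithmetic above can be reproduced directly from the parameter estimates, with prog=3 as the reference category. In this sketch the coefficient values come from the slide. The variable names and the `predict` helper are assumptions added for illustration.

```python
# Parameter estimates from the slide (prog=3 is the reference category)
intercept = 2.216
slope = -2.956
prog_offset = {1: -8.997, 2: 9.993, 3: 0.0}     # shifts in intercept
interaction = {1: 10.409, 2: 9.83, 3: 0.0}      # shifts in slope

# Each program gets its own intercept and its own slope
lines = {
    prog: (round(intercept + prog_offset[prog], 3),
           round(slope + interaction[prog], 3))
    for prog in (1, 2, 3)
}

def predict(prog, hours):
    """Predicted outcome for a program at a given number of hours (hypothetical helper)."""
    b0, b1 = lines[prog]
    return b0 + b1 * hours

for prog, (b0, b1) in lines.items():
    print(f"prog={prog}: intercept={b0}, slope={b1}")
```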

Live demo, Interactions in analysis of covariance

Break #5

  • What you have learned
    • Interactions in analysis of covariance
  • What’s coming next
    • Interactions in multiple linear regression

Interaction between hours and effort

effort: 
    label: weekly effort scores
    note: self report
    scale: 0 through 50
    direction:
      0 denoting minimal physical effort and
      50 denoting maximum effort.
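
An interaction between two continuous predictors is usually modeled with a product term, so that the slope for one predictor changes with the value of the other. The sketch below assumes a model of the form \(b_0 + b_1\,hours + b_2\,effort + b_3\,hours \times effort\). All coefficient values are invented for illustration and are not estimates from the exercise data.

```python
# Hypothetical coefficients for a model with a product term
b_intercept = 2.0
b_hours = 1.0
b_effort = 0.25
b_product = 0.5   # coefficient of hours * effort

def predict(hours, effort):
    """Predicted outcome under the assumed product-term model."""
    return (b_intercept + b_hours * hours + b_effort * effort
            + b_product * hours * effort)

def slope_for_hours(effort):
    """Change in the prediction per extra hour, at a fixed effort score."""
    return b_hours + b_product * effort

# The slope for hours differs between low and high effort (scale is 0 through 50)
print(slope_for_hours(10), slope_for_hours(40))
```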

Analysis of variance table

Live demo, Interactions in multiple linear regression

Break #6

  • What you have learned
    • Interactions in multiple linear regression
  • What’s coming next
    • Unbalanced data

Live demo, Unbalanced data

Summary

  • What you have learned
    • Indicator variables for three or more categories
    • Multiple factor analysis of variance
    • Checking assumptions of analysis of variance
    • Interactions in analysis of variance
    • Interactions in analysis of covariance
    • Interactions in multiple linear regression
    • Unbalanced data

Additional topics?